13 research outputs found

    Register Allocation Via Coloring of Chordal Graphs

    Full text link
    Abstract. We present a simple algorithm for register allocation which is competitive with the iterated register coalescing algorithm of George and Appel. We base our algorithm on the observation that 95% of the methods in the Java 1.5 library have chordal interference graphs when compiled with the JoeQ compiler. A greedy algorithm can optimally color a chordal graph in time linear in the number of edges, and we can easily add powerful heuristics for spilling and coalescing. Our experiments show that the new algorithm produces better results than iterated register coalescing for settings with few registers and comparable results for settings with many registers.
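
    For intuition, the coloring step can be sketched in a few lines of Java: maximum cardinality search (MCS) orders the vertices so that a single greedy pass is optimal on a chordal graph. This is a minimal sketch assuming an adjacency-list representation; it uses a simple quadratic vertex scan rather than the bucket-based MCS needed for the linear time bound, and it is not the paper's implementation.

    import java.util.*;

    class ChordalColoring {
        // Optimally color a chordal graph: MCS ordering, then greedy.
        static int[] color(List<Set<Integer>> adj) {
            int n = adj.size();
            int[] weight = new int[n];       // count of already-ordered neighbors
            boolean[] done = new boolean[n];
            int[] order = new int[n];
            for (int i = 0; i < n; i++) {    // maximum cardinality search
                int best = -1;
                for (int v = 0; v < n; v++)
                    if (!done[v] && (best < 0 || weight[v] > weight[best]))
                        best = v;
                done[best] = true;
                order[i] = best;
                for (int u : adj.get(best))
                    if (!done[u]) weight[u]++;
            }
            int[] color = new int[n];
            Arrays.fill(color, -1);
            // Greedy along the MCS order: a vertex's already-colored neighbors
            // form a clique in a chordal graph, so the smallest free color
            // is always an optimal choice.
            for (int v : order) {
                BitSet used = new BitSet();
                for (int u : adj.get(v))
                    if (color[u] >= 0) used.set(color[u]);
                color[v] = used.nextClearBit(0);
            }
            return color;
        }
    }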

    Register Allocation After Classical SSA Elimination is NP-Complete

    Full text link
    Abstract. Chaitin proved that register allocation is equivalent to graph coloring and hence NP-complete. Recently, Bouchez, Brisk, and Hack have proved independently that the interference graph of a program in static single assignment (SSA) form is chordal and therefore colorable in linear time. Can we use the result of Bouchez et al. to do register allocation in polynomial time by first transforming the program to SSA form, then performing register allocation, and finally doing the classical SSA elimination that replaces φ-functions with copy instructions? In this paper we show that the answer is no, unless P = NP: register allocation after classical SSA elimination is NP-complete. Chaitin’s proof technique does not work for programs after classical SSA elimination; instead we use a reduction from the graph coloring problem for circular arc graphs.
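
    To recall what the transformation does, here is a minimal Java sketch of classical SSA elimination, assuming illustrative Block and Phi types (not any particular compiler's IR): each φ-function x = φ(x1, ..., xk) is removed and a copy x = xi is appended to the i-th predecessor block.

    import java.util.*;

    class SsaElimination {
        static class Block {
            List<String> code = new ArrayList<>();    // straight-line instructions
            List<Phi> phis = new ArrayList<>();
            List<Block> preds = new ArrayList<>();    // predecessor blocks
        }
        static class Phi {
            String target;                            // x in x = phi(x1, ..., xk)
            List<String> sources = new ArrayList<>(); // sources.get(i) flows in from preds.get(i)
        }
        // Replace every phi with explicit copies in the predecessors.
        static void eliminatePhis(List<Block> cfg) {
            for (Block b : cfg) {
                for (Phi phi : b.phis)
                    for (int i = 0; i < b.preds.size(); i++)
                        b.preds.get(i).code.add(phi.target + " = " + phi.sources.get(i));
                b.phis.clear();
            }
        }
    }

    The inserted copies can take the resulting interference graph outside the chordal class, which is how the hardness result is consistent with Bouchez et al.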

    A Framework for Persistence-Enabled Optimization of Java Object Stores

    No full text

    A Local-View Array Library for Partitioned Global Address Space C++ Programs

    No full text
    Multidimensional arrays are an important data structure in many scientific applications. Unfortunately, built-in support for such arrays is inadequate in C++, particularly in the distributed setting where bulk communication operations are required for good performance. In this paper, we present a multidimensional array library for partitioned global address space (PGAS) programs, supporting the one-sided remote access and bulk operations of the PGAS model. The library is based on Titanium arrays, which have proven to provide good productivity and performance. These arrays provide a local view of data, where each rank constructs its own portion of a global data structure, matching the local view of execution common to PGAS programs and providing maximum flexibility in structuring global data. Unlike Titanium, which has its own compiler with array-specific analyses, optimizations, and code generation, we implement multidimensional arrays solely through a C++ library. The main goal of this effort is to provide a library-based implementation that can match the productivity and performance of a compiler-based approach. We implement the array library as an extension to UPC++, a C++ library for PGAS programs, and we extend Titanium arrays with specializations to improve performance. We evaluate the array library by porting four Titanium benchmarks to UPC++, demonstrating that it can achieve up to 25% better performance than Titanium without a significant increase in programmer effort.
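
    To make the "local view" concrete, here is a minimal sketch in Java (the language of Titanium, on whose arrays the library is based): each rank allocates only its own block of a conceptual global 2D domain but addresses it with global coordinates. The class name and block distribution are assumptions for illustration, not the UPC++ array library's actual API.

    // Hypothetical local-view 2D array: this rank owns global rows [rowLo, rowHi).
    class LocalGrid {
        final int rowLo, rowHi, cols;
        final double[] data;

        LocalGrid(int globalRows, int cols, int rank, int ranks) {
            int rowsPerRank = (globalRows + ranks - 1) / ranks;  // block distribution
            this.rowLo = Math.min(rank * rowsPerRank, globalRows);
            this.rowHi = Math.min(rowLo + rowsPerRank, globalRows);
            this.cols = cols;
            this.data = new double[(rowHi - rowLo) * cols];      // local slab only
        }

        // Accesses use *global* indices; only this rank's slab is addressable.
        double get(int gRow, int gCol) { return data[(gRow - rowLo) * cols + gCol]; }
        void set(int gRow, int gCol, double v) { data[(gRow - rowLo) * cols + gCol] = v; }
    }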

    From Flop to Megaflops: Java for Technical Computing

    No full text
    In this article we show how optimizing array bounds checks and null pointer checks creates loop nests on which aggressive optimizations can be used. Applying these optimizations by hand to a simple matrix-multiply test case leads to Java-compliant programs whose performance is in excess of 500 Mflops on a four-processor 332 MHz RS/6000 model F50 computer. We also report in this article the effect that various optimizations have on the performance of six floating-point-intensive benchmarks. Through these optimizations we have been able to achieve with Java at least 80% of the peak Fortran performance on the same benchmarks. Since all of these optimizations can be automated, we conclude that Java will soon be a serious contender for numerically intensive computing.
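
    The kind of transformation described can be sketched by hand in Java as loop versioning: a one-time guard proves every access in the nest is in bounds and non-null, creating a region where a compiler is free to drop per-access checks and optimize aggressively; otherwise a plain checked loop runs. This is a minimal illustration of the idea, not the authors' compiler transformation.

    class MatMul {
        // c += a * b for n x n matrices.
        static void multiply(double[][] a, double[][] b, double[][] c, int n) {
            if (inBounds(a, n) && inBounds(b, n) && inBounds(c, n)) {
                // Safe region: the guard proves all indices valid, so bounds
                // and null checks inside the nest are provably redundant.
                for (int i = 0; i < n; i++)
                    for (int k = 0; k < n; k++) {
                        double aik = a[i][k];
                        double[] bk = b[k], ci = c[i];
                        for (int j = 0; j < n; j++)
                            ci[j] += aik * bk[j];
                    }
            } else {
                // Fallback region: rely on the JVM's per-access checks.
                for (int i = 0; i < n; i++)
                    for (int k = 0; k < n; k++)
                        for (int j = 0; j < n; j++)
                            c[i][j] += a[i][k] * b[k][j];
            }
        }

        static boolean inBounds(double[][] m, int n) {
            if (m == null || m.length < n) return false;
            for (int i = 0; i < n; i++)
                if (m[i] == null || m[i].length < n) return false;
            return true;
        }
    }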